LINGUISTIC DESCRIPTION IN DICTIONARIES: SEMANTICS Towards a filtering of the relevant semantic information from MRDs

Authors

  • Toni BADIA
  • Roser SAURÍ
Abstract

Machine-readable dictionaries (henceforth MRDs) are a valuable source of lexical data for the automatic construction of lexical knowledge bases (LKBs). However, transferring information from the dictionary to the knowledge base often demands a filtering process, since in many cases the original dictionaries contain mistakes, redundancies and irregularities in granularity. We claim that this filtering process has to be linguistically oriented and, to this end, we start from an amended Generative Lexicon (GL) as a framework for the representation and treatment of lexical information. The use of such a framework allows us to distinguish two main kinds of polysemy.


Similar resources

Dynamic Categorization of Semantics of Fashion Language: A Memetic Approach

Categories are not invariant. This paper attempts to explore the dynamic nature of semantic category, in particular, that of fashion language, based on the cognitive theory of Dawkins’ memetics, a new theory of cultural evolution. Semantic attributes of linguistic memes decrease or proliferate in replication and spreading, which involves a dynamic development of semantic category. More specific...


Extraction of Semantic Clusters for Terminological Information Retrieval from MRDs

This paper describes a semantic clustering method for data extracted from machine readable dictionaries (MRDs) in order to build a terminological information retrieval system that finds terms from descriptions of concepts. We first examine approaches based on ontologies and statistics, before introducing our analogy-based approach that lets us extract semantic clusters by aligning definitions f...


Acquiring and Representing Semantic Information in a Lexical Knowledge Base

The paper focuses on the description of the approach, taken within the ESPRIT BRA project ACQUILEX, towards: i) acquisition of semantic information from several machine-readable dictionaries (in four languages), and ii) its representation in a common Lexical Knowledge Base. Knowledge extraction is guided by a) empirical observations and b) theoretical hypotheses. As for representation, we stress...


The Semantics of the Word Istikbar (Arrogance) in the Holy Quran based on Syntagmatic Relations (A Case Study of Semantic Proximity and Semantic Contrast)

The word istikbar (arrogance) is one of the key words in the monotheistic system of the Quran, which has found a special status as a special feature of the opponents and adversaries of the call to the truth. Given the prominent role of this issue in the human life system and its provision of corruption and moral deviations, it is necessary to represent the nature of the elements that make up th...


Combining Corpus and Machine-Readable Dictionary Data for Building Bilingual

This paper describes and discusses some theoretical and practical problems arising from developing a system to combine the structured but incomplete information from machine readable dictionaries (MRDs) with the unstructured but more complete information available in corpora for the creation of a bilingual lexical data base, presenting a methodology to integrate information from both sources in...




Publication date: 2000